1 |
Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals ...
Abstract:
The ability to sequence unordered events is an essential skill to comprehend and reason about real world task procedures, which often requires thorough understanding of temporal common sense and multimodal information, as these procedures are often communicated through a combination of texts and images. Such capability is essential for applications such as sequential task planning and multi-source instruction summarization. While humans are capable of reasoning about and sequencing unordered multimodal procedural instructions, whether current machine learning models have such essential capability is still an open question. In this work, we benchmark models' capability of reasoning over and sequencing unordered multimodal instructions by curating datasets from popular online instructional manuals and collecting comprehensive human annotations. We find models not only perform significantly worse than humans but also seem incapable of efficiently utilizing the multimodal information. To improve machines' ...
In: Proceedings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022 ...
Keyword:
Computation and Language cs.CL; Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences
URL: https://arxiv.org/abs/2110.08486 https://dx.doi.org/10.48550/arxiv.2110.08486
14 |
Extracting Dynamic Evidence Networks
In: DTIC AND NTIS (2004)
15 |
Challenges in Information Retrieval and Language Modeling
In: R. Manmatha (2003)
16 |
Challenges in Information Retrieval and Language Modeling
In: Andrew McCallum (2003)
18 |
Algorithms That Learn to Extract Information BBN: Description of the Sift System as Used for MUC-7
In: DTIC (1998)
20 |
BBN: Description of the PLUM System as Used for MUC-6
In: DTIC (1995)